Deep Active Learning for Named Entity Recognition
Deep learning has yielded state-of-the-art performance on many natural
language processing tasks including named entity recognition (NER). However,
this typically requires large amounts of labeled data. In this work, we
demonstrate that the amount of labeled training data can be drastically reduced
when deep learning is combined with active learning. While active learning is
sample-efficient, it can be computationally expensive since it requires
iterative retraining. To speed this up, we introduce a lightweight architecture
for NER, viz., the CNN-CNN-LSTM model consisting of convolutional character and
word encoders and a long short-term memory (LSTM) tag decoder. The model
achieves nearly state-of-the-art performance on standard datasets for the task
while being computationally much more efficient than the best-performing models. We
carry out incremental active learning during the training process and are
able to nearly match state-of-the-art performance with just 25% of the
original training data.
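To make the architecture concrete, below is a minimal PyTorch sketch of a CNN-CNN-LSTM tagger: a convolutional character encoder, a convolutional word encoder, and an LSTM tag decoder. Layer sizes, vocabulary sizes, and the teacher-forced decoding are illustrative assumptions, not the paper's exact configuration.

# Minimal sketch of a CNN-CNN-LSTM tagger (illustrative; dimensions and
# decoding details are assumptions, not the paper's exact setup).
import torch
import torch.nn as nn

class CNNCNNLSTMTagger(nn.Module):
    def __init__(self, char_vocab=100, word_vocab=10000, n_tags=9,
                 char_dim=30, word_dim=100, hidden=200):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        # Convolutional character encoder: convolve over characters, max-pool per word.
        self.char_cnn = nn.Conv1d(char_dim, char_dim, kernel_size=3, padding=1)
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        # Convolutional word encoder over concatenated word + character features.
        self.word_cnn = nn.Conv1d(word_dim + char_dim, hidden, kernel_size=3, padding=1)
        # LSTM tag decoder: consumes encoder output plus the previous tag embedding.
        self.tag_emb = nn.Embedding(n_tags + 1, 50)  # +1 for a start-of-sequence tag
        self.decoder = nn.LSTM(hidden + 50, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_tags)

    def forward(self, words, chars, prev_tags):
        # words: (batch, seq), chars: (batch, seq, word_len), prev_tags: (batch, seq)
        b, s, c = chars.shape
        ch = self.char_emb(chars.view(b * s, c)).transpose(1, 2)   # (b*s, char_dim, c)
        ch = torch.relu(self.char_cnn(ch)).max(dim=2).values       # (b*s, char_dim)
        ch = ch.view(b, s, -1)
        w = torch.cat([self.word_emb(words), ch], dim=-1).transpose(1, 2)
        enc = torch.relu(self.word_cnn(w)).transpose(1, 2)         # (b, s, hidden)
        dec_in = torch.cat([enc, self.tag_emb(prev_tags)], dim=-1)
        h, _ = self.decoder(dec_in)                                # teacher-forced decoding
        return self.out(h)                                         # (b, s, n_tags) logits

# Toy usage with random ids, just to check shapes.
model = CNNCNNLSTMTagger()
words = torch.randint(1, 10000, (2, 7))
chars = torch.randint(1, 100, (2, 7, 12))
prev_tags = torch.randint(0, 9, (2, 7))
print(model(words, chars, prev_tags).shape)  # torch.Size([2, 7, 9])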
Dense Information Flow for Neural Machine Translation
Recently, neural machine translation has achieved remarkable progress by
introducing well-designed deep neural networks into its encoder-decoder
framework. From the optimization perspective, residual connections are adopted
in most of these deep architectures to improve learning for both the encoder and
the decoder, and advanced attention connections are applied as well.
Inspired by the success of the DenseNet model in computer vision problems, in
this paper, we propose a densely connected NMT architecture (DenseNMT) that
trains more efficiently. The proposed DenseNMT not only allows dense
connections when creating new features for both the encoder and the decoder, but
also uses a dense attention structure to improve attention quality. Our
experiments on multiple datasets show that the DenseNMT structure is more
competitive and efficient.
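As a rough illustration of the dense-connection idea, the sketch below stacks encoder layers DenseNet-style, feeding each layer the concatenation of the input and all earlier layer outputs. The feed-forward layer type, dimensions, and final projection are assumptions for illustration, not the exact DenseNMT design.

# DenseNet-style dense connections in an encoder stack (illustrative sketch;
# the layer type and sizes are assumptions, not the DenseNMT specification).
import torch
import torch.nn as nn

class DenseEncoder(nn.Module):
    def __init__(self, d_model=256, growth=128, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        dim = d_model
        for _ in range(n_layers):
            # Each layer sees the concatenation of the input and all earlier outputs.
            self.layers.append(nn.Sequential(nn.Linear(dim, growth), nn.ReLU()))
            dim += growth
        self.proj = nn.Linear(dim, d_model)  # project back for attention/decoding

    def forward(self, x):
        feats = [x]                          # x: (batch, seq, d_model)
        for layer in self.layers:
            new = layer(torch.cat(feats, dim=-1))
            feats.append(new)                # dense connection: keep every feature map
        return self.proj(torch.cat(feats, dim=-1))

enc = DenseEncoder()
src = torch.randn(2, 11, 256)
print(enc(src).shape)  # torch.Size([2, 11, 256])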
Simple, efficient and robust approaches for large scale learning
Robustness of a model plays a vital role in large-scale machine learning. Classical estimators from robust statistics do not offer satisfactory computational efficiency as data size and model complexity grow. We draw ideas from robust statistics and focus on providing simple and efficient algorithmic paradigms for large-scale learning that are provably robust to corrupted training samples. We start from standard supervised and unsupervised problems, and then move towards several semi-supervised settings, including mixed linear regression as well as multi-instance multi-label learning. We analyze the algorithms under regular statistical settings with mild assumptions, thus providing theoretical support for applying the ideas to large-scale learning models such as deep neural networks. These simple algorithms serve as strong baselines and have achieved state-of-the-art results on certain tasks. The algorithmic paradigm is applicable to a wide range of problems, and our theoretical insights may also guide future research on robust large-scale learning.
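As a generic illustration of training that is robust to corrupted samples, the sketch below takes a gradient step on a trimmed loss, discarding the fraction of samples with the largest per-sample loss. This is a standard idea from this line of work, not necessarily the specific algorithm proposed here; the corruption fraction eps and the toy linear-regression setup are assumptions.

# Trimmed-loss gradient step (generic illustration of robustness to corrupted
# samples; eps and the toy setup are assumptions, not this work's algorithm).
import torch

def trimmed_loss_step(model, loss_fn, x, y, optimizer, eps=0.2):
    # One gradient step that ignores the eps fraction of samples with the largest loss.
    losses = loss_fn(model(x), y)                   # per-sample losses, shape (n,)
    k = int((1.0 - eps) * losses.numel())           # number of samples to keep
    kept, _ = torch.topk(losses, k, largest=False)  # keep the k smallest losses
    optimizer.zero_grad()
    kept.mean().backward()
    optimizer.step()
    return kept.mean().item()

# Toy usage: linear regression with a few corrupted labels.
torch.manual_seed(0)
x = torch.randn(100, 5)
y = x @ torch.ones(5, 1)
y[:10] += 50.0                                      # corrupt 10% of the labels
model = torch.nn.Linear(5, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
mse = torch.nn.MSELoss(reduction="none")
for _ in range(200):
    trimmed_loss_step(model, lambda p, t: mse(p, t).mean(dim=1), x, y, opt)
print(model.weight.data)                            # close to the true all-ones weights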